Random Walks on Directed Networks: Inference and Respondent-driven Sampling
نویسندگان
چکیده
Respondent driven sampling (RDS) is a method often used to estimate population properties (e.g. sexual risk behavior) in hard-to-reach populations. It combines an effective modified snowball sampling methodology with an estimation procedure that yields unbiased population estimates under the assumption that the sampling process behaves like a random walk on the social network of the population. Current RDS estimation methodology assumes that the social network is undirected, i.e. that all edges are reciprocal. However, empirical social networks in general also have non-reciprocated edges. To account for this fact, we develop a new estimation method for RDS in the presence of directed edges on the basis of random walks on directed networks. We distinguish directed and undirected edges and consider the possibility that the random walk returns to its current position in two steps through an undirected edge. We derive estimators of the selection probabilities of individuals as a function of the number of outgoing edges of sampled individuals. We evaluate the performance of the proposed estimators on artificial and empirical networks to show that they generally perform better than existing methods. This is in particular the case when the fraction of directed edges in the network is large.
منابع مشابه
Comparison of Random Walk Based Techniques for Estimating Network Averages
Function estimation on Online Social Networks (OSN) is an important field of study in complex network analysis. An efficient way to do function estimation on large networks is to use random walks. We can then defer to the extensive theory of Markov chains to do error analysis of these estimators. In this work we compare two existing techniques, Metropolis-Hastings MCMC and Respondent-Driven Sam...
متن کاملMining Inference Formulas by Goal-Directed Random Walks
Deep inference on a large-scale knowledge base (KB) needs a mass of formulas, but it is almost impossible to create all formulas manually. Data-driven methods have been proposed to mine formulas from KBs automatically, where random sampling and approximate calculation are common techniques to handle big data. Among a series of methods, Random Walk is believed to be suitable for knowledge graph ...
متن کاملRespondent-Driven Sampling in Online Social Networks
Abstract. Respondent-driven sampling (RDS) is a commonly used method for acquiring data on hidden communities, i.e., those that lack unbiased sampling frames or face social stigmas that make their members unwilling to identify themselves. Obtaining accurate statistical data about such communities is important because, for instance, they often have different health burdens from the greater popul...
متن کاملIdentifying Communities and Key Vertices by Reconstructing Networks from Samples
Sampling techniques such as Respondent-Driven Sampling (RDS) are widely used in epidemiology to sample "hidden" populations, such that properties of the network can be deduced from the sample. We consider how similar techniques can be designed that allow the discovery of the structure, especially the community structure, of networks. Our method involves collecting samples of a network by random...
متن کاملNew Survey Questions and Estimators for Network Clustering with Respondent-Driven Sampling Data
Respondent-driven sampling (RDS) is a popular method for sampling hard-to-survey populations that leverages social network connections through peer recruitment. While RDS is most frequently applied to estimate the prevalence of infections and risk behaviors of interest to public health, like HIV/AIDS or condom use, it is rarely used to draw inferences about the structural properties of social n...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1308.3600 شماره
صفحات -
تاریخ انتشار 2013